How to compress a redundant file?


000000000000000001001001000000000000000000100001000101000000 
000101000000000000000000000100000000010000000000000100000110 
000010000000000000001000000000001000010000010000000000000000 
000010000001000000000001000000000000000000000000010000000000 
000000010001000000100000010010000000000001000000001000000000 
000100100001010100000011000010000000000001000000000000000000 
000000000000000000000000010100000010000000000000100010100001 
100000000010000000000000000000000000000000000000000000000000 
010000000000000100000000000000000100001000000000110110000000 
101000101000000000000000000000000000100010100000100001000000 
000000000000100000100000100001001000000001010001000000000000 
000000000000000000000000011000000001010000000000100100010101 
000110010101000000000000100000000000000000100000001010000110 
100000001000000010110000000010000000010010000010000000000000 
100000000000000100000100000000000000100000000000000000000010 
000010001000000100000000000000100000000001000001000000000000 
110100000100100000000000000000001000000000000000001000000000 
100000000000000000000001000000010010001000000001100000011100 
000000000001000000000000000000000001000000000000001000000000 
000000000010000000000010100000100000000000010000000000001000 
101000100100010000000000000000000000000000000000001010000000 
010010000000000000000000000001000001100000010000000001000000 
000000001000100000000000010000000000000000000000000000001000 
000000000001000000010000000000000000000000000000000000000000 
000001000000000100100000000100000000100001000000100000000000 


e.g. N = 1000 tosses of a bent coin with p_1 = 0.1


How to measure information content? 


Claims: 1. The Shannon information content of an outcome

$$h(x = a_i) = \log_2 \frac{1}{P(x = a_i)}$$

is a sensible measure of information content.


2. The entropy

$$H(X) = \sum_x P(x) \log_2 \frac{1}{P(x)}$$

is a sensible measure of expected information
content.


(sketch h(p)) 
Point out additivity of h. 
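
The additivity of h can be checked numerically. The following is a minimal sketch, assuming two independent outcomes of the bent coin with P(1) = 0.1 and P(0) = 0.9; the function name `information_content` is chosen here for illustration and is not from the lecture.

```python
import math


def information_content(p: float) -> float:
    """Shannon information content h = log2(1/p), in bits, of an outcome with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return math.log2(1.0 / p)


# Bent coin: P(1) = 0.1, P(0) = 0.9.
h_one = information_content(0.1)
h_zero = information_content(0.9)

# Additivity: for independent outcomes the joint probability is the product
# of the individual probabilities, so the information contents add.
h_joint = information_content(0.1 * 0.9)

print(f"h(1) = {h_one:.3f} bits, h(0) = {h_zero:.3f} bits")
print(f"h(1, 0) = {h_joint:.3f} bits, h(1) + h(0) = {h_one + h_zero:.3f} bits")
```

The rare outcome (a 1) carries far more information than the common one (a 0), and the two printed values on the last line agree up to floating-point rounding.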


Testing 'Shannon information content' claims 
Shannon says:
'Most informative' experiment is the one with maximum entropy


How to measure information content? 


Claims: 1. The Shannon information content of an outcome

$$h(x = a_i) = \log_2 \frac{1}{P(x = a_i)}$$

is a sensible measure of information content.


2. The entropy

$$H(X) = \sum_x P(x) \log_2 \frac{1}{P(x)}$$

is a sensible measure of expected information
content.


3. Source coding theorem:
N outcomes from a source X can be compressed
into roughly N H(X) bits.
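
To make claim 3 concrete, here is a short sketch that evaluates N H(X) for the bent coin used in these slides (P(1) = 0.1, N = 1000). The helper name `entropy` is an assumption made for this example.

```python
import math


def entropy(probs):
    """Entropy H(X) = sum_x P(x) log2(1/P(x)), in bits.

    Outcomes with zero probability contribute nothing to the sum.
    """
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)


N = 1000                   # number of tosses
H = entropy([0.9, 0.1])    # bent coin: P(0) = 0.9, P(1) = 0.1

print(f"H(X) = {H:.3f} bits per toss")
print(f"N H(X) = {N * H:.0f} bits for N = {N} tosses")
```

For this source H(X) is about 0.47 bits per toss, so the theorem suggests that the 1000 tosses can be compressed into roughly 469 bits.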


The Bent Coin Lottery 


A coin with p_1 = 0.1 will be tossed N = 1000 times.
The outcome is x = x_1 x_2 ... x_N.
e.g., x = 000001001000100...00010


You can buy any of the 2^N possible tickets for £1 each,
before the coin-tossing.


If you own ticket x, you win £1,000,000,000.


Q To have a 99% chance of winning, 
at lowest possible cost, 
which tickets would you buy? 

• And how many tickets is that? 


Express your answer in the form 2^(...).


Lottery


[Figure: tickets available, a list of binary strings running from 0000000000.....00000 to all 1s]

Testing 'Shannon information content' claims 


• weighing problem


• sixty-three


Sixty-three 


We were able to give unique binary names to every 
one of 64 outcomes if each name was 6 bits long. 


c(42) = 101010
c(20) = 010100


Generalize — 


Q If there are S possible outcomes, how many bits
long must each name be, if each outcome has a 
unique name? 


Submarine


[Screenshot: submarine game demo, with panels labelled 'Unhit subs', 'Unhit squares' and 'Information learnt'; one frame shows Unhit subs: 1, Unhit squares: 32]

The Bent Coin Lottery 


A coin with p_1 = 0.1 will be tossed N = 1000 times.
The outcome is x = x_1 x_2 ... x_N.
e.g., x = 000001001000100...00010


You can buy any of the 2^N possible tickets for £1 each,
before the coin-tossing.


If you own ticket x, you win £1,000,000,000.


Q If you are forced to buy one ticket, 
which would you buy? 


Q To have a 99% chance of winning, 
at lowest possible cost, 
which tickets would you buy? 


• And how many tickets is that?


Express your answer in the form 2^(...).
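
One way to explore the 99% question numerically is sketched below. It assumes the natural strategy of buying the most probable tickets first: since a 1 has probability 0.1, a ticket with fewer 1s is always more probable, so tickets are bought in whole groups by their number of 1s. Buying the last group in full slightly overshoots the true minimum, so the count is an upper estimate. The function name `cheapest_ticket_set` is hypothetical.

```python
import math


def cheapest_ticket_set(n_tosses: int, threshold_percent: int = 99):
    """Buy all tickets with at most r ones, for the smallest r that reaches the threshold.

    Assumes P(1) = 1/10 exactly, so every probability is an integer over 10**n_tosses:
    P(k ones in total) = C(n, k) * 9**(n - k) / 10**n.
    Returns (r, number_of_tickets_bought).
    """
    denominator = 10 ** n_tosses
    cumulative_weight = 0   # numerator of P(number of ones <= r)
    ticket_count = 0
    for num_ones in range(n_tosses + 1):
        group_size = math.comb(n_tosses, num_ones)
        cumulative_weight += group_size * 9 ** (n_tosses - num_ones)
        ticket_count += group_size
        if 100 * cumulative_weight >= threshold_percent * denominator:
            return num_ones, ticket_count
    raise AssertionError("unreachable: the full set has probability 1")


max_ones, count = cheapest_ticket_set(1000)
print(f"buy every ticket with at most {max_ones} ones")
print(f"that is about 2^{math.log2(count):.0f} tickets, out of 2^1000")
```

Integer arithmetic keeps the cumulative probability exact, which avoids any floating-point underflow from terms such as 0.9^1000.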


Source coding theorem


3. Source coding theorem 


N outcomes from a source X can be compressed 
into roughly N H(X) bits.


Proved by counting the typical set 


Read chapter 4 to see the full proof and the general definition of typicality 


How to compress a redundant file, practically?


000000000000000001001001000000000000000000100001000101000000 
000101000000000000000000000100000000010000000000000100000110 
000010000000000000001000000000001000010000010000000000000000 
000010000001000000000001000000000000000000000000010000000000 
000000010001000000100000010010000000000001000000001000000000 
000100100001010100000011000010000000000001000000000000000000 
000000000000000000000000010100000010000000000000100010100001 
100000000010000000000000000000000000000000000000000000000000 
010000000000000100000000000000000100001000000000110110000000 
101000101000000000000000000000000000100010100000100001000000 
000000000000100000100000100001001000000001010001000000000000 
000000000000000000000000011000000001010000000000100100010101 
000110010101000000000000100000000000000000100000001010000110 
100000001000000010110000000010000000010010000010000000000000 
100000000000000100000100000000000000100000000000000000000010 
000010001000000100000000000000100000000001000001000000000000 
110100000100100000000000000000001000000000000000001000000000 
100000000000000000000001000000010010001000000001100000011100 
000000000001000000000000000000000001000000000000001000000000 
000000000010000000000010100000100000000000010000000000001000 
101000100100010000000000000000000000000000000000001010000000 
010010000000000000000000000001000001100000010000000001000000


Project: 
Invent a compressor and uncompressor for a 
source file of N = 10,000 bits, each having probability 
f = 0.01 of being a 1. 
Implement them and/or 
estimate how well your method works. 
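
As one possible starting point for the project, here is a minimal sketch of a run-length compressor. It is not the method from the lecture: it encodes the number of 0s before each 1 with an Elias gamma code, a technique swapped in here for simplicity, and it assumes the file length N is known to both compressor and uncompressor. Bits are held as strings of '0' and '1' for readability.

```python
import random


def elias_gamma(n: int) -> str:
    """Elias gamma codeword for an integer n >= 1: (len - 1) zeros, then n in binary."""
    binary = format(n, "b")
    return "0" * (len(binary) - 1) + binary


def compress(bits: str) -> str:
    """Encode the run of 0s before each 1 as gamma(run_length + 1).

    Trailing 0s are not encoded; the uncompressor restores them from the known length.
    """
    out = []
    run = 0
    for b in bits:
        if b == "0":
            run += 1
        else:
            out.append(elias_gamma(run + 1))
            run = 0
    return "".join(out)


def uncompress(code: str, length: int) -> str:
    """Invert compress(), padding with 0s up to the known file length."""
    out = []
    i = 0
    while i < len(code):
        zeros = 0
        while code[i] == "0":        # unary prefix gives the codeword width
            zeros += 1
            i += 1
        value = int(code[i:i + zeros + 1], 2)
        i += zeros + 1
        out.append("0" * (value - 1) + "1")
    decoded = "".join(out)
    return decoded + "0" * (length - len(decoded))


random.seed(0)                       # arbitrary seed, for repeatability only
N, f = 10_000, 0.01
source = "".join("1" if random.random() < f else "0" for _ in range(N))
code = compress(source)
assert uncompress(code, N) == source
print(len(source), "bits ->", len(code), "bits")
```

For comparison, N H(X) is about 808 bits for these parameters, and a gamma code spends roughly 13 bits on a typical gap of about 100, so this simple scheme is expected to fall short of the limit; a code tuned to the gap distribution should do better.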


Exercises 5.22, 5.26, 5.27


(Reading: Chapters 1-6)


